Visualization of code units across disparate systems

ABSTRACT

A visualization tool that provides visibility of the functionality implemented with each system used by an institution(s) at code unit granularity can be used to overcome a variety of challenges that can occur in an environment with disparate systems. The visualization tool discovers and graphically displays functions/procedures/methods (“code units”) that satisfy a set of one or more criteria, as well as attributes of the discovered code units. Furthermore, the visualization tool can automatically provide visual annotations to identify targets for asset maintenance, targets to leverage for other systems, etc.

RELATED APPLICATIONS

This application is a Continuation of and claims the priority benefit of U.S. application Ser. No. 12/033,079 filed Feb. 19, 2008.

BACKGROUND

Embodiments of the inventive subject matter generally relate to the field of computers, and, more particularly, to visualization of code units across disparate systems. A number of companies use a mixture of systems that may include a COBOL system, a PL/I system, a C language system, and a Java® language system.

BRIEF SUMMARY

According to one embodiment of the inventive subject matter, a method comprises searching across a plurality of disparate systems to find a plurality of code units that satisfy one or more search criteria. The plurality of disparate systems comprises code of different programming languages. Data about each of the plurality of code units is recorded. The data about each of the plurality of code units comprises a code unit identifier, indication of a programming language, and one or more attributes. A graphical representation of the plurality of code units is generated. The graphical representation comprises graphic elements that correspond to respective ones of the plurality of code units. The graphical representation is generated with control points associated with the graphic elements. A first of the control points, when activated, graphically decomposes a first of the graphic elements to reveal one or more attributes of a first of the plurality of code units represented by the first graphic element.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

The present embodiments may be better understood, and numerous objects, features, and advantages made apparent to those skilled in the art by referencing the accompanying drawings.

FIG. 1 depicts a conceptual example of discovering code units and generating a graphical representation of the code units.

FIG. 2 depicts a flowchart of example operations for generating a graphical representation of code units across disparate systems that satisfy one or more search criteria.

FIG. 3 depicts a flowchart of example operations for discovering code units and consuming search result data from multiple search processes.

FIG. 4 depicts a flowchart of example operations for tagging code units and parsing discovered code unit data.

FIG. 5 depicts an example computer system.

DETAILED DESCRIPTION

The description that follows includes exemplary systems, methods, techniques, instruction sequences and computer program products that embody techniques of the present inventive subject matter. However, it is understood that the described embodiments may be practiced without these specific details. For instance, although examples refer to a term-based discovery technique, embodiments can base discovery of code units on data types, code unit size, type of output, etc. In other instances, well-known instruction instances, protocols, structures and techniques have not been shown in detail in order not to obfuscate the description.

The description uses the terms “asset” and “system.” The term “asset” is used to refer to code. An example of an asset would be the COBOL code on one or more machines of an organization. The asset can be machine code, object code, source code, annotated code, etc. An asset can range from a single file to several files, including library files. Regardless of the number of files, an asset includes any number of functions or procedures (“code units” or “units of code”). In addition, assets of an organization can include code developed at different times by different developers. The term “system” is used to refer to a collection of one or more hardware components and the asset(s) associated with the one or more hardware components. The one or more hardware components can encode the asset(s), execute the asset(s), render output from an executing instance(s) of the asset(s), handle input for an executing instance(s) of the asset(s), etc.

A visualization tool that provides visibility of the functionality implemented with each system used by an institution(s) at code unit granularity can be used to overcome a variety of challenges that can occur in an environment with disparate systems. The visualization tool would allow efficient maintenance of systems, update of systems, modification of systems, collaboration between systems of different institutions, etc. A user can use the visualization tool to discover and graphically display functions/procedures/methods (“code units”) that satisfy a set of one or more criteria, as well as attributes of the discovered code units. The user can then efficiently perform various tasks. Examples of the various tasks include eliminating redundant code units across the disparate systems, adding systems that leverage existing functionality to deliver new services, etc. For instance, a bank may be developing Java code to provide online services to customers. The bank maintains account data with COBOL systems. The bank can use the visualization tool to discover and leverage existing functionality in the COBOL systems for access by the Java code to provide the online services. Furthermore, the visualization tool can automatically provide visual annotations to identify targets for asset maintenance, targets to leverage for other systems, etc.

FIG. 1 depicts a conceptual example of discovering code units and generating a graphical representation of the code units. Disparate systems 101 include hardware and a variety of assets. The disparate systems 101 include a COBOL asset 103, a Java asset 105, a PL/I asset 107, and a C asset 109. A disparate systems code unit visualization tool 111 accesses the disparate systems 101 to discover code units that satisfy search criteria submitted to the disparate systems code unit visualization tool 111. The disparate systems code unit visualization tool 111 comprises a discovery unit 113 and a graphical representation builder 117.

In FIG. 1, the disparate systems code unit visualization tool 111 receives a query “ACCOUNT LOOKUP BY LAST NAME” at the discovery unit 113. The discovery unit 113 searches across the disparate systems 101 for one or more code units in the assets 103, 105, 107, and 109 that satisfy the submitted search query. Embodiments can use different techniques to implement the search functionality. An embodiment can search the assets for strings that match the submitted query to different degrees (e.g., completely match, partially match, contain at least one term, etc.). Another embodiment can utilize one or more heuristics for the search to find code units that satisfy the submitted query. For instance, the discovery unit 113 can use the one or more heuristics to discover code units that have abbreviated versions of one or more of the search terms. Embodiments can use standard programming semantics to distinguish input variables, output variables, code unit names, etc. Embodiments can also pre-process the assets to assist in the discovery process. For example, tagged versions of the assets may be created with metadata or symbols that identify code unit names and attributes of the code units (e.g., code unit input, code unit output, data types, etc.) (“code unit elements”). As another example, code unit elements may be extracted from the assets, then delimited and/or tagged. Although FIG. 1 depicts the discovery unit 113 performing the discovery, the discovery unit 113 can invoke processes or daemons tailored to search particular ones of the assets (e.g., a COBOL asset search process, a PL/I asset search process, etc.). In addition, the discovery unit 113 can apply a reverse compiler to an asset that includes machine code.
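
As an illustrative, non-limiting sketch of term based matching to different degrees, the following Python fragment checks each query term against a code unit name and also tries an abbreviated form of the term. The abbreviation rule (keep the first letter, drop later vowels), the minimum abbreviation length, and the function names `abbreviate`, `term_hits`, and `match_degree` are assumptions made for illustration only; the description does not specify a particular heuristic.

```python
VOWELS = set("aeiou")


def abbreviate(term):
    """Hypothetical heuristic: keep the first letter, drop later vowels."""
    return term[0] + "".join(c for c in term[1:] if c not in VOWELS)


def term_hits(query, unit_name):
    """Return the query terms found in the code unit name, in query order."""
    name = unit_name.lower()
    hits = []
    for raw in query.lower().split():
        term = raw.strip(".,")
        if not term:
            continue
        abbr = abbreviate(term)
        # Accept a literal match, or an abbreviation of at least 4 characters
        # (shorter abbreviations are skipped to limit accidental matches).
        if term in name or (len(abbr) >= 4 and abbr in name):
            hits.append(term)
    return hits


def match_degree(query, unit_name):
    """Classify the match as 'complete', 'partial', or 'none'."""
    terms = [t.strip(".,") for t in query.lower().split() if t.strip(".,")]
    hits = term_hits(query, unit_name)
    if terms and len(hits) == len(terms):
        return "complete"
    return "partial" if hits else "none"


query = "ACCOUNT LOOKUP BY LAST NAME"
for unit in ("AccountByLastName", "FINDACCNTLASTNAME"):
    print(unit, match_degree(query, unit), term_hits(query, unit))
```

Under this assumed rule, “account” is abbreviated to “accnt,” which is why the PL/I code unit name FINDACCNTLASTNAME is reported as a partial match even though it never spells out the full term.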

During or after the discovery process, the discovery unit 113 records the results of the discovery process or search as discovery data 115. The depiction of the discovery data 115 within the disparate systems code unit visualization tool 111 is not intended to convey that the discovery data is recorded within the space of the tool 111, although it could be recorded to memory space owned by the instance of the tool 111. The discovery unit 113 can write the discovery results into a single file, multiple files, etc. The discovery unit 113 can record the results in accordance with different techniques to communicate the different elements of the discovered code units that satisfy the search criteria. Examples of the different techniques include tagging the results with metadata, recording the results with delimiters, recording the results with a particular structure, etc.

The graphical representation builder 117 reads the discovery data 115, and builds a structure with the discovery data 115. The graphical representation builder 117 builds a structure that associates graphical elements with the code unit elements indicated in the discovery data 115. The disparate systems code unit visualization tool 111 then supplies the structure built by the graphical representation builder 117 for display.

A graphical representation 119 of the structure supplied by the disparate systems code unit visualization tool 111 is depicted as a tree representation in FIG. 1. In FIG. 1, the graphical representation 119 includes a root graphic element that displays “RESULTS.” Beneath the root graphic element, the graphical representation 119 includes graphic elements that represent four code units discovered as satisfying the search criteria. The graphic elements display the following code unit names: “AccountLookupLname,” “AccountByLastName,” “FINDACCNTLASTNAME,” and “AccntLookUpLastName.” Each of the code unit name graphic elements also displays the relevant asset. The AccountLookupLname graphic element and the AccountByLastName graphic element indicate “(COBOL).” The FINDACCNTLASTNAME graphic element indicates “(PL/I).” The AccntLookUpLastName graphic element indicates “(Java).” The data that identifies the asset (in this example, the programming language of the relevant asset) can be hidden, stored as an attribute of the code unit, stored as a property of the graphic element, etc.

Each of the graphic elements that display the code unit names is associated with a graphic element that responds to activation (a “control point”) to graphically decompose the code unit name graphic element. Control points 120, 125, 127, and 129 are respectively associated with the AccountLookupLname graphic element, the AccountByLastName graphic element, the FINDACCNTLASTNAME graphic element, and the AccntLookUpLastName graphic element. The control points 125, 127, and 129 are depicted as pointing to the right to convey that the control points are not activated. The control point 120 is illustrated as pointing down to convey that the control point 120 has been activated to graphically decompose the AccountLookupLname graphic element. Activation of the control point 120 decomposes the AccountLookupLname graphic element to reveal graphical representations of two attributes of the code unit AccountLookupLname. The decomposition reveals input and output attributes of the code unit with an “INPUT(S)” graphic element and an “OUTPUT(S)” graphic element.

Each of the attribute graphic elements is also associated with a control point. The “INPUT(S)” graphic element is associated with a control point 121, which is not activated. The “OUTPUT(S)” graphic element is associated with a control point 123, which is activated. Activation of the control point 123 graphically decomposes the “OUTPUT(S)” graphic element to reveal a code unit element, which is the output variable “AccountNumber.” The code unit element “AccountNumber” is graphically represented with a graphic element that displays “AccountNumber” beneath the “OUTPUT(S)” graphic element.
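
The decomposition behavior described for FIG. 1 can be sketched, under stated assumptions, as a small tree in which each node carries the state of its control point. In the following Python fragment, the `Node` class, the `render` and `code_unit` helpers, the text markers (`>` for not activated, `v` for activated), and the input variable `LastName` are hypothetical; only the labels “RESULTS,” “INPUT(S),” “OUTPUT(S),” “AccountNumber,” and the code unit names come from the example above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)
    expanded: bool = False  # state of this node's control point

    def toggle(self):
        """Activate or deactivate the control point."""
        self.expanded = not self.expanded


def render(node, depth=0):
    """Return display lines; children appear only under activated nodes."""
    if node.children:
        marker = "v " if node.expanded else "> "
    else:
        marker = "  "  # leaf elements have nothing to decompose
    lines = ["  " * depth + marker + node.label]
    if node.expanded:
        for child in node.children:
            lines.extend(render(child, depth + 1))
    return lines


def code_unit(name, language, inputs, outputs):
    """Build a code unit element with INPUT(S) and OUTPUT(S) attributes."""
    return Node(f"{name} ({language})", [
        Node("INPUT(S)", [Node(v) for v in inputs]),
        Node("OUTPUT(S)", [Node(v) for v in outputs]),
    ])


# "LastName" and the second unit's attributes are assumed for illustration.
lookup = code_unit("AccountLookupLname", "COBOL", ["LastName"], ["AccountNumber"])
other = code_unit("AccountByLastName", "COBOL", ["LastName"], ["AccountNumber"])
root = Node("RESULTS", [lookup, other], expanded=True)

lookup.toggle()              # corresponds to activating control point 120
lookup.children[1].toggle()  # corresponds to activating control point 123
print("\n".join(render(root)))
```

Keeping the activation state on the node, rather than in the display code, is one possible design choice; it lets the same structure be re-rendered after each activation without rebuilding it.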

With the example depicted in FIG. 1, a user could eliminate the PL/I code unit and the “AccountByLastName” code unit as redundant elements. The visualization tool 111 can record references (e.g., pointers) to the code units and/or calls to the code units, and provide an option or command to eliminate code units. In response to selection of the command or option, the visualization tool 111 can perform various operations to carry out elimination of the code units and/or calls to the code units. Examples of operations include commenting out the code units and/or calls to the code units, deleting the calls to the code units, generating a request to cull the code units, etc. In another embodiment, the visualization tool 111 accesses the assets and displays the code units selected for elimination to allow the user to make edits to the assets.

A user can also use the visualization tool 111 to leverage existing systems. The user could decide to call the AccountLookupLname code unit from another asset. The user activates the control point 121 to reveal the input parameters of the code unit in order to properly pass values/parameters and invoke the code unit.

The example in FIG. 1 assumes that the discovery unit applies a filter to the assets based on one or more search terms, but a filter can be created based on predefined options, user specific configurations, etc. Embodiments can present a user with a menu of options for selection instead of or in addition to the user entering terms. For example, a user can select to search on one or more code unit elements (e.g., code unit name, input, output, etc.). The user can then add one or more search terms to apply to the selected code unit element.

FIG. 2 depicts a flowchart of example operations for generating a graphical representation of code units across disparate systems that satisfy one or more search criteria. At block 201, one or more search criteria are received. At block 203, a plurality of code units stored among a plurality of disparate systems is identified in accordance with the search criteria. Embodiments can set various thresholds for the satisfaction of the one or more search criteria to perform discovery. For instance, an embodiment can disregard articles and discover a code unit that only satisfies 60% of the non-article terms, assuming a term based discovery operation. As stated above, embodiments are not limited to term based discovery. For example, embodiments can search for code units that use encryption functionality.
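
The threshold example above (disregard articles, then require 60% of the remaining terms) can be sketched as follows. The function name `satisfies`, the article list, and the representation of a code unit as a set of lowercase terms are assumptions for illustration; integer arithmetic is used so that a match of exactly 60% is not affected by floating point rounding.

```python
ARTICLES = {"a", "an", "the"}


def satisfies(query, unit_terms, threshold_percent=60):
    """True if enough non-article query terms occur in the unit's terms."""
    terms = [t for t in query.lower().split() if t not in ARTICLES]
    if not terms:
        return False  # nothing left to match after dropping articles
    matched = sum(1 for t in terms if t in unit_terms)
    # Compare matched/len(terms) against the threshold without division.
    return matched * 100 >= threshold_percent * len(terms)


# Three of five non-article terms match, i.e., exactly 60%.
print(satisfies("lookup the account by last name", {"account", "last", "name"}))
# One of two non-article terms matches, i.e., 50%.
print(satisfies("find the account", {"account"}))
```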

At block 205, a code unit identifier, an indication of programming language, and one or more attributes (e.g., input variables, output variables, internally referenced code units, code unit size, etc.) are recorded for each of the discovered plurality of code units.

At block 207, a graphical representation of the discovered plurality of code units is generated. The graphical representation comprises graphic elements (e.g., folder icons, basic shapes, etc.) that represent elements of the plurality of code units. The graphical representation comprises graphic elements that represent the code units (“code unit graphic elements”) and graphic elements that represent each of the attributes of the code units (“attribute graphic elements”). The indication of programming language can also be associated with a graphic element, which can be displayed with various techniques. Examples of display techniques include displaying a programming language indication graphic element in response to a variety of requests (e.g., selection of an option in a menu, selecting a button, etc.), and defining a property of the code unit graphic elements with the programming language indication (e.g., defining different colors to represent different programming languages, etc.). Embodiments can also display text to convey the indication of programming language. The graphic elements are associated with the control points to hierarchically decompose the graphical representation. A control point graphically decomposes a graphic element to reveal child graphic elements in response to activation of the control point.

To generate the graphical representation, embodiments can consume the recorded data for discovered code units in accordance with different techniques. Examples of the different techniques include a single process serially searching assets and writing to a single file, multiple processes or sub-processes searching assets in parallel and writing to individual files or a shared file, writing references to relevant portions of the assets, etc.

FIG. 3 depicts a flowchart of example operations for discovering code units and consuming search result data from multiple search processes. At block 301, one or more search criteria are received. At block 303, asset specific search processes are invoked or launched with the received search criteria. After the search result data is received from the asset specific search processes, the search result data is parsed into code unit elements (i.e., code unit identifier and one or more attributes) and associated with graphic elements. A discovery unit or module reads the search result data and parses the search result data based on occurrence of delimiters. For example, the discovery unit can read the search result data with an understanding that code unit elements will be provided in a given order. Further, particular delimiters can be used to distinguish different types of code unit elements, different code units, and multiple occurrences of a given type of code unit element (e.g., multiple input variables). As another example, the search result data can be tagged with metadata to distinguish the different code unit elements and multiple occurrences of a code unit element. Embodiments can also task the asset specific search processes with creation of hierarchical structures to represent the results, and task the discovery unit with aggregating the different structures provided by the asset specific search processes.
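
As a minimal sketch of delimiter based parsing, the following Python fragment assumes a hypothetical record layout that is not specified in this description: one code unit per line, fields in the fixed order name, language, inputs, outputs separated by `|`, and multiple occurrences of an element separated by commas. The function name `parse_results` and the PL/I variable names in the sample data are likewise assumptions.

```python
def parse_results(raw):
    """Parse delimited search result data into code unit elements."""
    units = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        # Fields are expected in a given order; a record with a different
        # number of fields raises ValueError instead of being guessed at.
        name, language, inputs, outputs = line.split("|")
        units.append({
            "name": name,
            "language": language,
            "inputs": [v for v in inputs.split(",") if v],
            "outputs": [v for v in outputs.split(",") if v],
        })
    return units


raw = (
    "AccountLookupLname|COBOL|LastName|AccountNumber\n"
    "FINDACCNTLASTNAME|PL/I|LNAME,BRANCH|ACCTNO\n"
)
for unit in parse_results(raw):
    print(unit)
```

A tagged format (for example, metadata keys per element) would remove the dependence on field order at the cost of more verbose search result data.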

At block 307, a hierarchical representation structure is built with the parsed code unit elements. As stated above, asset specific hierarchical structures may have already been built. Operations can then be performed to aggregate the multiple structures into a single structure. At block 309, the hierarchical representation structure is supplied for display.
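
The aggregation of asset specific structures into a single structure could be sketched as shown below. The nested dictionary layout (`label` and `children` keys), the convention that each asset specific structure is labeled with its programming language, and the function name `aggregate` are assumptions chosen for illustration.

```python
def aggregate(structures, root_label="RESULTS"):
    """Merge asset specific hierarchies under a single root element."""
    root = {"label": root_label, "children": []}
    for structure in structures:
        language = structure["label"]
        for unit in structure.get("children", []):
            # Copy each code unit under the root and annotate it with the
            # language of the asset specific structure it came from.
            root["children"].append({
                "label": f"{unit['label']} ({language})",
                "children": unit.get("children", []),
            })
    return root


cobol = {"label": "COBOL", "children": [
    {"label": "AccountLookupLname", "children": [
        {"label": "OUTPUT(S)", "children": [
            {"label": "AccountNumber", "children": []}]}]},
]}
java = {"label": "Java", "children": [{"label": "AccntLookUpLastName"}]}

merged = aggregate([cobol, java])
print([child["label"] for child in merged["children"]])
```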

Embodiments can employ different techniques to parse discovery data. Embodiments can search raw assets and/or search prepared assets (e.g., tagged assets or tagged copies of assets). An embodiment can crawl over all assets or copies of assets and tag the assets with metadata to indicate code unit elements. An embodiment can crawl over assets and catalog code units and code unit elements. An embodiment can incrementally catalog code units. For example, each time a discovery is performed, the recorded data is stored to gradually build a database of code unit elements for visualization of code units across disparate systems.

FIG. 4 depicts a flowchart of example operations for tagging code units and parsing discovered code unit data. At block 401, one or more search criteria are received. At block 403, it is determined if the target asset includes a code unit that has not been tagged. If the target asset includes a code unit that has not been tagged, then control flows to block 405. If the target asset does not include a code unit that has not been tagged, then control flows to block 407.

At block 405, each code unit that has not been tagged is parsed and tagged to indicate elements of the code unit. Control flows from block 405 to block 407.

At block 407, the first code unit of the target asset is searched for an element that satisfies some or all of the one or more search criteria. At block 409, it is determined if an element of the searched code unit was discovered. If an element was discovered, then control flows to block 411. If an element was not discovered, then control flows to block 413.

At block 411, data is recorded for all elements of the code unit indicated with tags. Of course, it is not necessary for embodiments to record data for all elements indicated with tags. A tool can record data for only some of the elements of the code unit. For example, a tool can be configured to record all data in one discovery and only the code unit name and I/O variables for a different discovery. Control flows from block 411 to block 413.

At block 413, it is determined if the asset comprises additional code units. If so, then control flows to block 415. If not, then control flows to block 417.

At block 415, a next code unit of the target asset is searched for an element that satisfies some or all of the received search criteria. Control flows from block 415 back to block 409.

If it was determined there were no additional code units at block 413, then the result(s) of the search are returned at block 417.
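
The flow of FIG. 4 can be sketched, as one possible and non-limiting reading, in the following Python fragment. The representation of an asset as a list of dictionaries, the signature notation `Name(inputs) -> output` parsed by `tag_unit`, the substring test used as the search, and the `PrintReport` code unit are all assumptions for illustration. Block numbers in the comments refer to the operations described above.

```python
import re

# Hypothetical signature notation: Name(input1, input2) -> output
SIGNATURE = re.compile(r"(\w+)\((.*?)\)\s*->\s*(\w+)")


def tag_unit(unit):
    """Parse an untagged code unit and attach tags for its elements."""
    match = SIGNATURE.match(unit["source"])
    if match is None:
        unit["tags"] = {"name": unit["source"], "inputs": [], "outputs": []}
        return
    name, inputs, output = match.groups()
    unit["tags"] = {
        "name": name,
        "inputs": [v.strip() for v in inputs.split(",") if v.strip()],
        "outputs": [output],
    }


def search_asset(asset, criteria):
    """Tag untagged code units, then search each unit's tagged elements."""
    for unit in asset:                      # blocks 403 and 405
        if "tags" not in unit:
            tag_unit(unit)
    results = []
    for unit in asset:                      # blocks 407, 413, and 415
        tags = unit["tags"]
        elements = [tags["name"], *tags["inputs"], *tags["outputs"]]
        discovered = any(                   # block 409
            term.lower() in element.lower()
            for term in criteria
            for element in elements
        )
        if discovered:
            results.append(dict(tags))      # block 411
    return results                          # block 417


asset = [
    {"source": "AccountLookupLname(LastName) -> AccountNumber"},
    {"source": "PrintReport(ReportId) -> Status"},
]
print(search_asset(asset, ["account"]))
```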

It should be understood that the depicted flowcharts are examples meant to aid in understanding embodiments and should not be used to limit embodiments or limit scope of the claims. Embodiments may perform additional operations, fewer operations, operations in a different order, operations in parallel, and some operations differently. For instance, referring to FIG. 3, additional operations can be performed to map the elements of the hierarchical representation structure to graphic elements. For example, elements can be associated with different icons based on level within the hierarchical structure. In another embodiment, operations are performed to map each of the elements of the hierarchical structure to graphic elements prior to being supplied for display. Referring to FIG. 2, operations to record data and build the graphical representation can be performed concurrently. As data is recorded, the graphical representation or the hierarchical representation structure can be built. As another example, operations to build the graphical representation or hierarchical structure can include recording data (e.g., the data is recorded into the hierarchical structure).

Embodiments of the inventive subject matter may be provided as a computer program product, or software, that may include a machine-readable medium having stored thereon instructions (e.g., firmware, resident software, micro-code, etc.), which may be used to program a computer system (or other electronic device(s)) to perform a process(es) according to embodiments, whether presently described or not since every conceivable variation is not enumerated herein. A machine readable medium includes any mechanism or article of manufacture for storing or transmitting information in a form (e.g., software, processing application) readable by a machine (e.g., a computer, a mobile phone, etc.). The machine-readable medium may include, but is not limited to, magnetic storage medium (e.g., floppy diskette); optical storage medium (e.g., CD-ROM); magneto-optical storage medium; read only memory (ROM); random access memory (RAM); erasable programmable memory (e.g., EPROM and EEPROM); flash memory; or other types of medium suitable for storing executable instructions. In addition, embodiments may be embodied in an electrical, optical, acoustical or other form of propagated signal (e.g., carrier waves, infrared signals, digital signals, etc.), or wireline, wireless, or other communications medium.

Computer program code for carrying out operations of the present inventive subject matter may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).

It will be understood that each block of the above flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present inventive subject matter. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s).

FIG. 5 depicts an example computer system. A computer system includes a processor unit 501 (possibly including multiple processors, multiple cores, multiple nodes, and/or implementing multi-threading, etc.). The computer system includes memory 507. The memory 507 may be system memory (e.g., one or more of cache, SRAM, DRAM, zero capacitor RAM, Twin Transistor RAM, eDRAM, EDO RAM, DDR RAM, EEPROM, NRAM, RRAM, SONOS, PRAM, etc.) and/or any one or more of the above already described possible realizations of machine-readable media. The computer system also includes a bus 503 (e.g., PCI, ISA, PCI-Express, HyperTransport®, InfiniBand®, NuBus, etc.), a network interface 509 (e.g., an ATM interface, an Ethernet interface, a Frame Relay interface, SONET interface, wireless interface, etc.), and a storage device(s) 511 (e.g., optical storage, magnetic storage, etc.). The computer system also includes a disparate systems code unit visualization tool 521 that performs the functionalities for discovering code units across disparate systems and causing visualization of the discovered code units. Some or all of the functionality performed by the tool 521 can be implemented with an application specific integrated circuit, in logic implemented in the processor unit 501, in a co-processor on a peripheral device or card, etc. Moreover, some or all of the functionality performed by the tool 521 can be embodied in code stored on one or more of the memory 507, the processor unit 501, a co-processor, and the storage device 511. Further, realizations may include fewer or additional components not illustrated in FIG. 5 (e.g., video cards, audio cards, additional network interfaces, peripheral devices, etc.). The processor unit 501, the storage device(s) 511, the disparate systems code unit visualization tool 521, and the network interface 509 are coupled to the bus 503. Although illustrated as being coupled to the bus 503, the memory 507 may be coupled to the processor unit 501.

While the embodiments are described with reference to various implementations and exploitations, it will be understood that these embodiments are illustrative and that the scope of the inventive subject matter is not limited to them. In general, techniques for visualizing code units of disparate systems as described herein may be implemented with facilities consistent with any hardware system or hardware systems. Many variations, modifications, additions, and improvements are possible.

The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present embodiments has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the embodiments in the form disclosed.

Plural instances may be provided for components, operations or structures described herein as a single instance. Finally, boundaries between various components, operations and data stores are somewhat arbitrary, and particular operations are illustrated in the context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within the scope of the inventive subject matter. In general, structures and functionality presented as separate components in the example configurations may be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component may be implemented as separate components. These and other variations, modifications, additions, and improvements may fall within the scope of the inventive subject matter.

1. A method comprising: identifying at least one code unit of a plurality of code units stored within a plurality of disparate systems, said identifying in accordance with one or more criteria, wherein the plurality of code units comprises code of different programming languages; storing data corresponding to each of said at least one code unit, wherein the data comprises a code unit identifier, indication of a programming language, and one or more attributes; and generating a graphical representation of the at least one code unit, wherein the graphical representation comprises graphic elements that indicate the code unit identifier and the one or more attributes, wherein the graphical representation comprises a control point that, when activated, graphically decomposes a first of the graphic elements to reveal a second of the graphic elements that indicates the one or more attributes of the at least one code unit.
2. The method of claim 1, wherein said identifying comprises invoking a plurality of asset specific search processes with the one or more criteria.
3. The method of claim 2, wherein a first of the plurality of asset specific search processes searches an asset of a first programming language.
4. The method of claim 1, wherein each of the plurality of disparate systems comprises an asset coded in accordance with one or more programming languages.
5. The method of claim 4, wherein the plurality of assets comprises a first asset with COBOL code and a second asset with code of an object oriented programming language.
6. The method of claim 1, wherein the one or more attributes comprises at least one of input, output, data type, and code unit size.
7. The method of claim 1 further comprising building a hierarchical representation with the data, wherein the graphical representation is generated based, at least in part, on the hierarchical representation.
8. The method of claim 1 further comprising automatically identifying redundant code units that occur within a single one of the disparate systems.